Tachyon: Memory Throughput I/O for Cluster Computing Frameworks

Authors

  • Haoyuan Li
  • Ali Ghodsi
  • Matei Zaharia
  • Eric Baldeschwieler
  • Scott Shenker
  • Ion Stoica
Abstract

As more big data computations move in-memory, I/O throughput dominates the running time of many workloads. In distributed storage, caching can improve read throughput, but write throughput is limited by both disk and network bandwidth because data is replicated for fault tolerance. This paper proposes a new file system architecture that lets frameworks both read and write reliably at memory speed by avoiding synchronous data replication on writes.
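
The abstract does not say how writes stay reliable without synchronous replication; the related abstract listed further down this page names lineage as the technique. The following is a minimal sketch of that idea under stated assumptions, not Tachyon's actual API: the class `LineageStore` and its methods `write`, `read`, and `lose` are hypothetical names made up for illustration, and a single Python process stands in for a cluster.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class LineageStore:
    """Toy store (hypothetical): one in-memory copy per file plus its recipe."""

    def __init__(self) -> None:
        self._memory: Dict[str, bytes] = {}
        # name -> (function that produced the file, names of its input files)
        self._lineage: Dict[str, Tuple[Callable[..., bytes], List[str]]] = {}

    def write(self, name: str, compute: Callable[..., bytes],
              inputs: Sequence[str] = ()) -> None:
        # Record how the file is produced, then keep a single in-memory copy.
        # No replica is written synchronously to disk or to another node.
        self._lineage[name] = (compute, list(inputs))
        self._memory[name] = compute(*(self.read(i) for i in inputs))

    def read(self, name: str) -> bytes:
        # If the in-memory copy is gone, recompute it from its inputs,
        # recursively recovering any inputs that were lost as well.
        if name not in self._memory:
            compute, inputs = self._lineage[name]
            self._memory[name] = compute(*(self.read(i) for i in inputs))
        return self._memory[name]

    def lose(self, name: str) -> None:
        # Simulate losing the in-memory copy, e.g. after a node failure.
        self._memory.pop(name, None)


store = LineageStore()
store.write("raw", lambda: b"a b a")
store.write("upper", lambda raw: raw.upper(), ["raw"])
store.lose("raw")
store.lose("upper")
recovered = store.read("upper")  # recomputed from lineage, not from a replica
```

The sketch assumes every `compute` function is deterministic, so that recomputation yields the same bytes as the lost copy. It trades replication cost on every write for recomputation cost on the rare read that follows a loss.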

Similar resources

Reliable, Memory Speed Storage for Cluster Computing Frameworks

Tachyon is a distributed file system enabling reliable data sharing at memory speed across cluster computing frameworks. While caching today improves read workloads, writes are either network or disk bound, as replication is used for fault-tolerance. Tachyon eliminates this bottleneck by pushing lineage, a well-known technique borrowed from application frameworks, into the storage layer. The ke...

Scheduling for Improved Write Performance in a Cost-Effective, Fault-Tolerant Parallel Virtual File System (CEFT-PVFS)

Without any additional hardware, CEFT-PVFS utilizes the existing disks on each cluster node to provide RAID-10 style parallel I/O service. In CEFT-PVFS, all servers are also computational nodes and can be heavily loaded by different applications running on the cluster, thus potentially degrading the I/O performance. To minimize the degradation, I/O requests can be scheduled on a less loaded ser...

Node.Scala: Implicit Parallel Programming for High-Performance Web Services

Event-driven programming frameworks such as Node.JS have recently emerged as a promising option for Web service development. Such frameworks feature a simple programming model with implicit parallelism and asynchronous I/O. The benefits of the event-based programming model in terms of concurrency management need to be balanced against its limitations in terms of scalability on multicore architec...

A light-weight, collaborative temporary file system for clustered Web servers

Previous studies indicate that I/O could become a performance bottleneck in commodity PC-based cluster Web servers. Current local native file systems do not work well for expensive file I/Os while specialized file systems have a limitation on portability. In this paper, we present a lightweight, collaborative temporary file system (CTFS) to improve disk I/O performance for clustered Web servers...

File System Workload Analysis For Large Scale Scientific Computing Applications

Parallel scientific applications require high-performance I/O support from underlying file systems. A comprehensive understanding of the expected workload is therefore essential for the design of high-performance parallel file systems. We re-examine the workload characteristics in parallel computing environments in the light of recent technology advances and new applications. We analyze applica...

Publication date: 2013